[D] - This might be a bad question, but is there any way to analyze the similarities in the features extracted by neural networks without knowing anything about the nature of the input data (except perhaps the max and min allowed values)? Consider a network that pulls text from images vs. an LLM.
For example, consider two neural networks. One is a standard LLM like GPT, and the other can only take in image data and uses it to operate a robotic arm. For the sake of argument, let's assume the robot-arm model is trained to read instructions written down in its field of vision, which effectively means it must internally extract text from images.
Both of these models would have totally different input and output domains (text to text vs. image to robot-arm movements), and yet they would both likely have hidden features that correspond to similar linguistic structures. For example, they would probably both have hidden features that represent concepts like the number 2, since they would need to be able to perform commands that say "do XYZ 2 times".
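To make "similar features" concrete, here is a minimal sketch under a strong assumption: that hidden activations from both networks can be recorded on matched stimuli (say, the same instruction given as text to one and as an image to the other). That assumption is exactly what the blind setting lacks, so this only illustrates the easier, paired case. It uses linear CKA, a commonly used representation-similarity measure chosen here for illustration; the function name `linear_cka` and the random toy activations are hypothetical stand-ins, not real model outputs.

```python
import numpy as np


def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two activation matrices.

    X has shape (n_stimuli, d1) and Y has shape (n_stimuli, d2).
    Row i of X and row i of Y must come from the same (matched) stimulus.
    The hidden widths d1 and d2 are allowed to differ.
    """
    # Center every feature so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of each network's own feature covariance.
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Toy data (assumption): random activations standing in for real hidden states.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(200, 16))

# A rotated and rescaled copy of acts_a: the same information in a different basis.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
acts_b = 3.0 * acts_a @ rotation

# Unrelated activations with a different hidden width.
acts_c = rng.normal(size=(200, 32))

print("same features, different basis:", linear_cka(acts_a, acts_b))
print("unrelated features:", linear_cka(acts_a, acts_c))
```

The score does not change when one network's features are rotated or uniformly rescaled, which is why it can compare layers of different widths without aligning individual neurons. It still needs the row-by-row pairing of stimuli, though.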
If you only had access to these networks themselves but didn't know anything about the input or output domains, would it still be possible to realize that these networks are representing similar features internally?
submitted by /u/30299578815310